 Beavercreek


Classical AI vs. LLMs for Decision-Maker Alignment in Health Insurance Choices

Mainali, Mallika, Sureshbabu, Harsha, Sen, Anik, Rauch, Christopher B., Reifsnyder, Noah D., Meyer, John, Turner, J. T., Floyd, Michael W., Molineaux, Matthew, Weber, Rosina O.

arXiv.org Artificial Intelligence

As algorithmic decision-makers are increasingly applied to high-stakes domains, AI alignment research has evolved from a focus on universal value alignment to context-specific approaches that account for decision-maker attributes. Prior work on Decision-Maker Alignment (DMA) has explored two primary strategies: (1) classical AI methods integrating case-based reasoning, Bayesian reasoning, and naturalistic decision-making, and (2) large language model (LLM)-based methods leveraging prompt engineering. While both approaches have shown promise in limited domains such as medical triage, their generalizability to novel contexts remains underexplored. In this work, we implement a prior classical AI model and develop an LLM-based algorithmic decision-maker evaluated using a large reasoning model (GPT-5) and a non-reasoning model (GPT-4) with weighted self-consistency under a zero-shot prompting framework, as proposed in recent literature. We evaluate both approaches on a health insurance decision-making dataset annotated for three target decision-makers with varying levels of risk tolerance (0.0, 0.5, 1.0). In the experiments reported herein, classical AI and LLM-based models achieved comparable alignment with attribute-based targets, with classical AI exhibiting slightly better alignment for a moderate risk profile.
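The abstract does not say how the weights in weighted self-consistency are computed. As a minimal sketch, assume each sampled LLM response yields an answer together with a numeric weight (for example, a score of how well it matches the target risk tolerance), and the final choice is the answer with the largest total weight. The function name, the plan labels, and the weights below are hypothetical.

```python
from collections import defaultdict


def weighted_self_consistency(samples):
    """Pick the answer whose sampled responses carry the largest total weight.

    samples: list of (answer, weight) pairs, one per sampled response.
    The weighting scheme itself is an assumption, not taken from the paper.
    """
    totals = defaultdict(float)
    for answer, weight in samples:
        totals[answer] += weight
    # max over the keys, ranked by accumulated weight
    return max(totals, key=totals.get)


# Hypothetical samples: two answers, four sampled responses.
samples = [("plan_a", 0.9), ("plan_b", 0.6), ("plan_b", 0.5), ("plan_a", 0.1)]
chosen = weighted_self_consistency(samples)
```

Here "plan_b" wins because its two moderately weighted votes outweigh the single strong vote for "plan_a", which is the behavior that separates weighted voting from picking the single best-scored sample.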


$O(p \log d)$ Subgraph Isomorphism using Stigmergic Swarming Agents

Parunak, H. Van Dyke

arXiv.org Artificial Intelligence

Subgraph isomorphism compares two graphs (sets of nodes joined by edges) to determine whether they contain a common subgraph. Many applications require identifying the subgraph, not just deciding its existence. A particularly common use case, using graphs with labeled nodes, seeks to find instances of a smaller pattern graph with $p$ nodes in the larger data graph with $d$ nodes. The problem is NP-complete, so that naïve solutions are exponential in $p + d$. A wide range of heuristics have been proposed, with the best complexity $O(p^2d^2)$. This paper outlines ASSIST (Approximate Swarming Subgraph Isomorphism through Stigmergy), inspired by the ant colony optimization approach to the traveling salesperson problem. ASSIST is linearithmic, $O(p \log d)$, and also supports matching problems (such as temporally ordered edges, inexact matches, and missing nodes or edges in the data graph) that frustrate other heuristics.
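For contrast with the heuristics discussed above, the following is a minimal sketch of the naive exponential baseline for labeled pattern matching. It is not ASSIST and uses no swarming or stigmergy. It assumes undirected graphs, treats a match as a label-preserving injective mapping under which every pattern edge appears in the data graph, and all names are hypothetical.

```python
def find_pattern(pattern_labels, pattern_edges, data_labels, data_edges):
    """Return one mapping {pattern node: data node}, or None if no match exists.

    Brute-force backtracking; worst-case cost is exponential in the graph sizes.
    """
    data_adj = {frozenset(edge) for edge in data_edges}
    p_nodes = list(pattern_labels)

    def extend(mapping):
        if len(mapping) == len(p_nodes):
            return dict(mapping)
        p = p_nodes[len(mapping)]  # next unmapped pattern node
        for d in data_labels:
            # Candidates must be unused and carry the same label.
            if d in mapping.values() or data_labels[d] != pattern_labels[p]:
                continue
            mapping[p] = d
            # Every pattern edge between mapped nodes must exist in the data graph.
            consistent = all(
                frozenset((mapping[a], mapping[b])) in data_adj
                for a, b in pattern_edges
                if a in mapping and b in mapping
            )
            if consistent:
                result = extend(mapping)
                if result is not None:
                    return result
            del mapping[p]  # backtrack
        return None

    return extend({})


# Hypothetical example: an A-B edge pattern in a three-node data graph.
match = find_pattern(
    {"x": "A", "y": "B"}, [("x", "y")],
    {1: "A", 2: "A", 3: "B"}, [(2, 3)],
)
```

The search first tries mapping x to node 1, fails to place the edge, and backtracks to node 2, which illustrates why the cost grows so quickly with the number of candidate nodes.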


Investigating the Impact of Observation Space Design Choices On Training Reinforcement Learning Solutions for Spacecraft Problems

Hamilton, Nathaniel, Dunlap, Kyle, Hobbs, Kerianne L

arXiv.org Artificial Intelligence

AAS 25-147. Recent research using Reinforcement Learning (RL) to learn autonomous control for spacecraft operations has shown great success. However, a recent study showed that the performance of these agents could be improved by changing the action space, i.e. control outputs, used in the learning environment. This has opened the door for finding more improvements through further changes to the environment. The work in this paper focuses on how changes to the environment's observation space can impact the training and performance of RL agents learning the spacecraft inspection task. The studies are split into two groups. The first looks at the impact of sensors that were designed to help agents learn the task. The second looks at the impact of reference frames, reorienting the agent to see the world from a different perspective. The results show the sensors are not necessary, but most of them help agents learn more optimal behavior, and that the reference frame does not have a large impact, but is best kept consistent. Introduction: Autonomous spacecraft operation is a critical capability for managing the growing number of space assets and increasingly complex operations.


Investigating the Impact of Choice on Deep Reinforcement Learning for Space Controls

Hamilton, Nathaniel, Dunlap, Kyle, Hobbs, Kerianne L.

arXiv.org Artificial Intelligence

For many space applications, traditional control methods are often used during operation. However, as the number of space assets continues to grow, autonomous operation can enable rapid development of control methods for different space-related tasks. One method of developing autonomous control is Reinforcement Learning (RL), which has become increasingly popular after demonstrating promising performance and success across many complex tasks. While it is common for RL agents to learn bounded continuous control values, this may not be realistic or practical for many space tasks that traditionally prefer an on/off approach for control. This paper analyzes using discrete action spaces, where the agent must choose from a predefined list of actions. The experiments explore how the number of choices provided to the agents affects their measured performance during and after training. This analysis is conducted for an inspection task, where the agent must circumnavigate an object to inspect points on its surface, and a docking task, where the agent must move into proximity of another spacecraft and "dock" with a low relative speed. A common objective of both tasks, and most space tasks in general, is to minimize fuel usage, which motivates the agent to regularly choose an action that uses no fuel. Our results show that a limited number of discrete choices leads to optimal performance for the inspection task, while continuous control leads to optimal performance for the docking task.
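One way to picture a discrete action space with a varying number of choices is sketched below. The abstract does not give the paper's discretization, so this assumes evenly spaced thrust values on a single axis, symmetric about zero so that a no-fuel action is always available. The function name and the thrust limit are hypothetical.

```python
def discrete_thrust_choices(n_choices, u_max):
    """Evenly spaced thrust values in [-u_max, u_max] for one control axis.

    n_choices must be odd and at least 3 so that zero thrust (no fuel use)
    is always one of the options. The spacing is an illustrative assumption.
    """
    if n_choices < 3 or n_choices % 2 == 0:
        raise ValueError("n_choices must be an odd integer >= 3")
    half = (n_choices - 1) // 2
    step = u_max / half
    # Build symmetrically around the middle index so the centre is exactly 0.0.
    return [(i - half) * step for i in range(n_choices)]


on_off = discrete_thrust_choices(3, 1.0)  # full reverse, off, full forward
finer = discrete_thrust_choices(5, 1.0)   # adds half-thrust options
```

With three choices the agent has a pure on/off thruster per axis; raising the count moves the action space closer to continuous control, which is the range the experiments described above explore.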


Accurate Crystal Structure Prediction of New 2D Hybrid Organic Inorganic Perovskites

Karimitari, Nima, Baldwin, William J., Muller, Evan W., Bare, Zachary J. L., Kennedy, W. Joshua, Csányi, Gábor, Sutton, Christopher

arXiv.org Artificial Intelligence

Low dimensional hybrid organic-inorganic perovskites (HOIPs) represent a promising class of electronically active materials for both light absorption and emission. The design space of HOIPs is extremely large, since a diverse space of organic cations can be combined with different inorganic frameworks. This immense design space allows for tunable electronic and mechanical properties, but also necessitates the development of new tools for in silico high throughput analysis of candidate structures. In this work, we present an accurate, efficient, transferable and widely applicable machine learning interatomic potential (MLIP) for predicting the structure of new 2D HOIPs. Using the MACE architecture, an MLIP is trained on 86 diverse experimentally reported HOIP structures. The model is tested on 73 unseen perovskite compositions, and achieves chemical accuracy with respect to the reference electronic structure method. Our model is then combined with a simple random structure search algorithm to predict the structure of hypothetical HOIPs given only the proposed composition. Success is demonstrated by correctly and reliably recovering the crystal structure of a set of experimentally known 2D perovskites. Such a random structure search is impossible with ab initio methods due to the associated computational cost, but is relatively inexpensive with the MACE potential. Finally, the procedure is used to predict the structure formed by a new organic cation with no previously known corresponding perovskite. Laboratory synthesis of the new hybrid perovskite confirms the accuracy of our prediction. This capability, applied at scale, enables efficient screening of thousands of combinations of organic cations and inorganic layers.


Collision Avoidance and Geofencing for Fixed-wing Aircraft with Control Barrier Functions

Molnar, Tamas G., Kannan, Suresh K., Cunningham, James, Dunlap, Kyle, Hobbs, Kerianne L., Ames, Aaron D.

arXiv.org Artificial Intelligence

Safety-critical failures often have fatal consequences in aerospace control. Control systems on aircraft, therefore, must ensure the strict satisfaction of safety constraints, preferably with formal guarantees of safe behavior. This paper establishes the safety-critical control of fixed-wing aircraft in collision avoidance and geofencing tasks. A control framework is developed wherein a run-time assurance (RTA) system modulates the nominal flight controller of the aircraft whenever necessary to prevent it from colliding with other aircraft or crossing a boundary (geofence) in space. The RTA is formulated as a safety filter using control barrier functions (CBFs) with formal guarantees of safe behavior. CBFs are constructed and compared for a nonlinear kinematic fixed-wing aircraft model. The proposed CBF-based controllers showcase the capability of safely executing simultaneous collision avoidance and geofencing, as demonstrated by simulations on the kinematic model and a high-fidelity dynamical model.
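The safety-filter idea can be illustrated on a toy problem far simpler than the fixed-wing models in the paper. The sketch assumes a one-dimensional single integrator (position x with velocity command u) and a geofence at x_max, with barrier h(x) = x_max - x. The CBF condition dh/dt >= -alpha * h then reduces to u <= alpha * (x_max - x), and the minimally invasive filter clips the nominal command to that bound. All names and numbers are hypothetical.

```python
def cbf_filter(u_nom, x, x_max, alpha):
    """Closest command to u_nom that satisfies u <= alpha * (x_max - x)."""
    return min(u_nom, alpha * (x_max - x))


def simulate(x0, u_nom, x_max, alpha, dt, steps):
    """Forward-Euler rollout of x' = u with the safety filter in the loop."""
    x = x0
    trajectory = [x]
    for _ in range(steps):
        u = cbf_filter(u_nom, x, x_max, alpha)
        x = x + dt * u
        trajectory.append(x)
    return trajectory


# The nominal controller always pushes toward the fence at speed 5.
traj = simulate(x0=0.0, u_nom=5.0, x_max=10.0, alpha=1.0, dt=0.01, steps=2000)
```

Far from the fence the filter leaves the nominal command untouched; near the fence it scales the approach speed with the remaining distance, so the state converges toward x_max without crossing it (for this discretization, provided alpha * dt <= 1).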


Deep Reinforcement Learning for Autonomous Spacecraft Inspection using Illumination

van Wijk, David, Dunlap, Kyle, Majji, Manoranjan, Hobbs, Kerianne L.

arXiv.org Artificial Intelligence

This paper investigates the problem of on-orbit spacecraft inspection using a single free-flying deputy spacecraft, equipped with an optical sensor, whose controller is a neural network control system trained with Reinforcement Learning (RL). This work considers the illumination of the inspected spacecraft (chief) by the Sun in order to incentivize acquisition of well-illuminated optical data. The agent's performance is evaluated through statistically efficient metrics. Results demonstrate that the RL agent is able to inspect all points on the chief successfully, while maximizing illumination on inspected points in a simulated environment, using only low-level actions. Due to the stochastic nature of RL, 10 policies were trained using 10 random seeds to obtain a more holistic measure of agent performance. Over these 10 seeds, the interquartile mean (IQM) percentage of inspected points for the finalized model was 98.82%.
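For readers unfamiliar with the interquartile mean, here is a minimal sketch of one common convention: sort the per-seed scores, drop the lowest and highest quarter (rounding the cut down to a whole number of samples), and average the rest. The paper's exact convention is not stated in the abstract, and the scores below are made-up numbers, not results from the paper.

```python
def interquartile_mean(values):
    """Mean of the values left after trimming floor(n/4) from each end."""
    ordered = sorted(values)
    cut = len(ordered) // 4
    middle = ordered[cut:len(ordered) - cut]
    return sum(middle) / len(middle)


# Hypothetical scores from 10 seeds, with one outlier.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
iqm = interquartile_mean(scores)
```

With 10 seeds, two scores are trimmed from each end, so the outlier of 100 does not affect the result, whereas it would dominate a plain mean.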


Indecision Trees: Learning Argument-Based Reasoning under Quantified Uncertainty

Kent, Jonathan S., Menager, David H.

arXiv.org Artificial Intelligence

Using Machine Learning systems in the real world can often be problematic, with inexplicable black-box models, the assumed certainty of imperfect measurements, or providing a single classification instead of a probability distribution. This paper introduces Indecision Trees, a modification to Decision Trees which learn under uncertainty, can perform inference under uncertainty, provide a robust distribution over the possible labels, and can be disassembled into a set of logical arguments for use in other reasoning systems.


Anticipatory Thinking Challenges in Open Worlds: Risk Management

Amos-Binks, Adam, Dannenhauer, Dustin, Gilpin, Leilani H.

arXiv.org Artificial Intelligence

Anticipatory thinking drives our ability to manage risk - identification and mitigation - in everyday life, from bringing an umbrella when it might rain to buying car insurance. As AI systems become part of everyday life, they too have begun to manage risk. Autonomous vehicles log millions of miles, and StarCraft and Go agents have similar capabilities to humans, implicitly managing risks presented by their opponents. To further increase performance in these tasks, out-of-distribution evaluation can characterize a model's bias, what we view as a type of risk management. However, learning to identify and mitigate low-frequency, high-impact risks is at odds with the observational bias required to train machine learning models. StarCraft and Go are closed-world domains whose risks are known and mitigations well documented, ideal for learning through repetition. Adversarial filtering datasets provide difficult examples but are laborious to curate and static, both barriers to real-world risk management. Adversarial robustness focuses on model poisoning under the assumption there is an adversary with malicious intent, without considering naturally occurring adversarial examples. These methods are all important steps towards improving risk management but do so without considering open worlds. We unify these open-world risk management challenges with two contributions. The first is our perception challenges, designed for agents with imperfect perceptions of their environment whose consequences have a high impact. Our second contribution is our cognition challenges, designed for agents that must dynamically adjust their risk exposure as they identify new risks and learn new mitigations. Our goal with these challenges is to spur research into solutions that assess and improve the anticipatory thinking required by AI agents to manage risk in open worlds and ultimately the real world.


A Framework for Characterizing Novel Environment Transformations in General Environments

Molineaux, Matthew, Dannenhauer, Dustin, Kildebeck, Eric

arXiv.org Artificial Intelligence

To be robust to surprising developments, an intelligent agent must be able to respond to many different types of unexpected change in the world. To date, there are no general frameworks for defining and characterizing the types of environment changes that are possible. We introduce a formal and theoretical framework for defining and categorizing environment transformations, changes to the world an agent inhabits. We introduce two types of environment transformation: R-transformations which modify environment dynamics and T-transformations which modify the generation process that produces scenarios. We present a new language for describing domains, scenario generators, and transformations, called the Transformation and Simulator Abstraction Language (T-SAL), and a logical formalism that rigorously defines these concepts. Then, we offer the first formal and computational set of tests for eight categories of environment transformations. This domain-independent framework paves the way for describing unambiguous classes of novelty, constrained and domain-independent random generation of environment transformations, replication of environment transformation studies, and fair evaluation of agent robustness.